EAN: Event Adaptive Network for Enhanced Action Recognition

Authors

Abstract

Efficiently modeling spatial–temporal information in videos is crucial for action recognition. To achieve this goal, state-of-the-art methods typically employ the convolution operator and dense interaction modules such as non-local blocks. However, these methods cannot accurately fit the diverse events in videos. On the one hand, the adopted convolutions have fixed scales, and thus struggle with events of various scales. On the other hand, the dense interaction modeling paradigm only achieves sub-optimal performance, as action-irrelevant parts bring additional noise to the final prediction. In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content with the following designs. First, when extracting local cues, we generate spatial–temporal kernels of dynamic scale to adaptively fit the diverse events. Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer, which yields a sparse paradigm. We call the proposed framework the Event Adaptive Network (EAN) because both key designs are adaptive to the input video content. To exploit the short-term motions within local segments, we propose a novel and efficient Latent Motion Code (LMC) module, further improving the performance of the framework. Extensive experiments on several large-scale video datasets, e.g., Something-to-Something V1 & V2, Kinetics, and Diving48, verify that our models achieve state-of-the-art or competitive performance at low FLOPs. Codes are available at: https://github.com/tianyuan168326/EAN-Pytorch .
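The sparse interaction design described above (modeling interactions only among a few selected foreground object tokens with a Transformer) can be illustrated with a minimal PyTorch sketch. The scoring head, top-k selection, and single `nn.TransformerEncoderLayer` below are assumptions made for the illustration; this is not the authors' implementation, which is available at the linked repository.

```python
import torch
import torch.nn as nn

class SparseForegroundInteraction(nn.Module):
    """Illustrative sketch: score spatial-temporal tokens, keep the top-k as
    'foreground objects', and let only those interact via a Transformer."""

    def __init__(self, dim: int = 256, num_heads: int = 4, k: int = 8):
        super().__init__()
        self.k = k
        self.score = nn.Linear(dim, 1)               # hypothetical token-scoring head
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, batch_first=True
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (B, N, C) spatial-temporal feature tokens
        scores = self.score(tokens).squeeze(-1)       # (B, N)
        topk = scores.topk(self.k, dim=1).indices     # indices of k foreground tokens
        idx = topk.unsqueeze(-1).expand(-1, -1, tokens.size(-1))
        selected = tokens.gather(1, idx)              # (B, k, C)
        refined = self.encoder(selected)              # sparse interaction among k tokens
        # scatter refined tokens back; background tokens pass through unchanged
        return tokens.scatter(1, idx, refined)

if __name__ == "__main__":
    x = torch.randn(2, 49, 256)                       # e.g. a 7x7 grid of frame features
    print(SparseForegroundInteraction()(x).shape)     # torch.Size([2, 49, 256])
```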


Related articles

Network Event Recognition

Network protocols can be tested by capturing communication packets, assembling them into high-level events, and comparing these to a finite state machine that describes the protocol standard. This process, which we call Network Event Recognition (NER), faces a number of challenges only partially addressed by existing systems. These include the ability to provide precise conformance with spec...
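The conformance step described here (checking a reassembled event stream against a protocol finite state machine) can be sketched as follows. The states, events, and transitions are invented for illustration and are not taken from the paper.

```python
# Hypothetical finite-state conformance check for a reassembled event stream.
# The transition table is illustrative, not from the NER paper.
TRANSITIONS = {
    ("CLOSED", "SYN"): "SYN_SENT",
    ("SYN_SENT", "SYN_ACK"): "ESTABLISHED",
    ("ESTABLISHED", "FIN"): "CLOSED",
}

def conforms(events, start="CLOSED"):
    """Return (True, final_state) if every event is a legal transition."""
    state = start
    for ev in events:
        nxt = TRANSITIONS.get((state, ev))
        if nxt is None:
            return False, state      # event not allowed by the protocol FSM
        state = nxt
    return True, state

print(conforms(["SYN", "SYN_ACK", "FIN"]))   # (True, 'CLOSED')
print(conforms(["SYN", "FIN"]))              # (False, 'SYN_SENT')
```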


An enhanced method for human action recognition

This paper presents a fast and simple method for human action recognition. The proposed technique relies on detecting interest points using SIFT (scale-invariant feature transform) in each frame of the video. A fine-tuning step is used to limit the number of interest points according to the amount of detail. Then the popular Bag of Video Words approach is applied with a new normaliza...
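The pipeline sketched above (per-frame SIFT descriptors followed by a Bag of Video Words histogram) looks roughly like the following. The OpenCV and scikit-learn calls and the plain L1 normalization are assumptions; the snippet's own normalization scheme is truncated and not reproduced here.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def frame_descriptors(video_path, max_per_frame=200):
    """Collect SIFT descriptors from every frame, capped per frame."""
    sift = cv2.SIFT_create(nfeatures=max_per_frame)
    cap = cv2.VideoCapture(video_path)
    descs = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, d = sift.detectAndCompute(gray, None)
        if d is not None:
            descs.append(d)
    cap.release()
    return np.vstack(descs) if descs else np.empty((0, 128))

def bag_of_video_words(train_descs, video_descs, vocab_size=500):
    """Build a codebook on training descriptors, then histogram one video."""
    codebook = KMeans(n_clusters=vocab_size, n_init=10).fit(train_descs)
    words = codebook.predict(video_descs)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / (hist.sum() + 1e-8)     # simple L1 normalization (assumed)
```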


Adaptive Structured Pooling for Action Recognition

where $s \in S_k$ and $\Psi_s(p) = 1$ if $p \in s$, $\Psi_s(p) = 0$ otherwise. $M_k^t$ is L1-normalized and square-rooted. For a video of $T$ frames: $M_k(x, y, t) = \{ M_k^1(x, y), \ldots, M_k^T(x, y) \}$. For each feature $x_m \in X$, with $(x_{x_m}, y_{x_m}, t_{x_m})$ as the spatiotemporal coordinates of its centroid, the weight $w_m$ is a local integral of the pooling map $M_k$: $$w_m = \int_{x_{x_m}-v_x}^{x_{x_m}+v_x} \int_{y_{x_m}-v_y}^{y_{x_m}+v_y} \int_{t_{x_m}-v_t}^{t_{x_m}+v_t} M_k(x, y, t)\, dx\, dy \ldots$$
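Read discretely, this weight is simply the sum of the pooling map over a small spatiotemporal window around the feature's centroid. A rough NumPy rendering of that reading follows; the window half-sizes are placeholder values, not the paper's parameters.

```python
import numpy as np

def pooling_weight(M, cx, cy, ct, vx=4, vy=4, vt=2):
    """Discrete analogue of the integral: sum the pooling map M(x, y, t)
    over a (2*vx) x (2*vy) x (2*vt) window centred on the feature centroid."""
    x0, x1 = max(cx - vx, 0), min(cx + vx, M.shape[0])
    y0, y1 = max(cy - vy, 0), min(cy + vy, M.shape[1])
    t0, t1 = max(ct - vt, 0), min(ct + vt, M.shape[2])
    return M[x0:x1, y0:y1, t0:t1].sum()

M = np.random.rand(64, 64, 16)             # toy pooling map for a 16-frame clip
print(pooling_weight(M, cx=30, cy=20, ct=8))
```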


Adaptive learning codebook for action recognition

Learning a compact yet discriminative codebook is an important procedure for local feature-based action recognition. A common procedure involves two independent phases: reducing the dimensionality of local features and then performing clustering. Since the two phases are disconnected, dimensionality reduction does not necessarily capture the dimensions that are most helpful for codebook ...
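The two disconnected phases criticized here (dimensionality reduction first, clustering second) correspond to the conventional baseline sketched below; the joint, adaptive formulation the paper proposes is not shown. The data and sizes are placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Conventional two-phase codebook baseline the snippet argues against:
# PCA is fit with no knowledge of the later clustering objective.
features = np.random.rand(10000, 128)                      # toy local descriptors

reduced = PCA(n_components=32).fit_transform(features)     # phase 1: reduce
codebook = KMeans(n_clusters=256, n_init=10).fit(reduced)  # phase 2: cluster
print(codebook.cluster_centers_.shape)                     # (256, 32)
```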


Adaptive Tuboid Shapes for Action Recognition

Encoding local motion information using spatio-temporal features is a common approach in action recognition methods. These features are based on the information content inside subregions extracted at locations of interest in a video. In this paper, we propose a conceptually different approach to video feature extraction. We adopt an entropy-based saliency framework and develop a method for estim...
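As a rough illustration of entropy-based saliency (one plausible reading of the truncated snippet, not the paper's method), the Shannon entropy of the intensity histogram in each patch can serve as a simple saliency score:

```python
import numpy as np

def patch_entropy(gray, patch=16, bins=32):
    """Per-patch Shannon entropy of intensities, used as a saliency proxy."""
    h, w = gray.shape
    sal = np.zeros((h // patch, w // patch))
    for i in range(sal.shape[0]):
        for j in range(sal.shape[1]):
            block = gray[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            hist, _ = np.histogram(block, bins=bins, range=(0, 256))
            p = hist / hist.sum()
            p = p[p > 0]
            sal[i, j] = -(p * np.log2(p)).sum()
    return sal

gray = (np.random.rand(128, 128) * 255).astype(np.uint8)
print(patch_entropy(gray).shape)    # (8, 8)
```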



Journal

Journal title: International Journal of Computer Vision

Year: 2022

ISSN: 0920-5691, 1573-1405

DOI: https://doi.org/10.1007/s11263-022-01661-1